Microsoft Unveils Powerful AI 'Small Language Model' for Researchers


Microsoft has launched its latest compact language model, Phi-2, which demonstrates performance comparable or superior to certain larger open-source Llama 2 models despite being far smaller, at 2.7 billion parameters. Over the past few months, the Machine Learning Foundations team at Microsoft Research has introduced a series of small language models (SLMs) known as "Phi" that have shown impressive results across various benchmarks.
The first model, the 1.3 billion-parameter Phi-1, achieved state-of-the-art performance on Python coding among existing SLMs (specifically on the HumanEval and MBPP benchmarks). "We are now releasing Phi-2, a 2.7 billion-parameter language model that demonstrates outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with less than 13 billion parameters", the company said in an update.
Phi-2 offers an ideal platform for researchers to explore mechanistic interpretability, enhance safety measures, and conduct fine-tuning experiments across a diverse range of tasks. "We have made Phi-2 available in the Azure AI Studio model catalog to foster research and development on language models", said Microsoft. The massive increase in the size of language models to hundreds of billions of parameters has unlocked a host of emerging capabilities that have redefined the landscape of natural language processing.
However, the question remains whether such emerging capabilities can be attained at a smaller scale through strategic training choices, such as data selection. "Our line of work with the Phi models aims to answer this question by training SLMs that achieve performance on par with models of much higher scale (yet still far from the frontier models)", said Microsoft. The company has also performed extensive testing on prompts commonly used by the research community. "We observed a behaviour in accordance with the expectation we had given the benchmark results", said the tech giant.
Source: IANS